Taxonomy and Evaluation of Markers for Computational Stylistics

Authors

  • Foaad Khosmood
  • Robert Levinson
Abstract

Currently, stylistic analysis of natural language texts is performed with a wide variety of techniques that involve many different algorithms, feature sets and collection methods. Most machine-learning methods rely on feature extraction to model the text and perform classification. But what are the best features for making style-based distinctions? While many researchers have developed particular collections of style features, called style markers, no definitive list exists. In this paper we present an organized collection of such style markers with performance data on a diverse set of texts. We show that for each training document, one or more markers exist that can distinguish it from others, providing a basis for a weighted, combined set of markers that outperforms any of the individual ones. We examine and categorize 502 style markers, both individually and as a set, and evaluate their performance on several English language text collections.
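As a rough illustration of what feature extraction with style markers can look like, the Python sketch below computes a handful of simple markers for a text and compares two texts with a weighted distance. The marker names, the short function-word list, and the weighting scheme are assumptions chosen for illustration; they are not the 502 markers or the combination method evaluated in the paper.

```python
import re

# Illustrative function words only; not the list used in the paper.
FUNCTION_WORDS = ("the", "of", "and", "to", "a")


def style_markers(text):
    """Return a dict of a few simple, hypothetical style markers for one text."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)  # guard against empty input
    markers = {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / n_words,
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        markers["freq_" + fw] = words.count(fw) / n_words
    return markers


def weighted_distance(m1, m2, weights):
    """Weighted L1 distance between two marker dicts.

    Markers without an explicit weight default to a weight of 1.0.
    """
    return sum(weights.get(k, 1.0) * abs(m1[k] - m2[k]) for k in m1)
```

In a setup like this, a marker that separates one training document from the rest could be given a larger weight, so that the combined distance leans on the markers that are most discriminative for that document.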


Similar resources

Computational Stylistics using Artificial Neural Networks

Previous work in using Artificial Neural Networks for computational stylistics has concentrated on using large, arbitrary network structures. This paper examines the use of the Cascade-Correlation algorithm for the construction of minimal networks. We find that a number of problems in computational stylistics with a large number of variables, but a limited number of training examples may be sol...


Using the Taxonomy and the Metrics: What to Study When and Why; Comment on “Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review”

Dukhanin and colleagues’ taxonomy of metrics for patient engagement at the organizational and system levels has great potential for supporting more careful and useful evaluations of this ever-growing phenomenon. This commentary highlights the central importance to the taxonomy of metrics assessing the extent of meaningful participation in decision-making by patients, consumers and community mem...


Content Evaluation of Iranian EFL Textbook Vision 1 Based on Bloom’s Revised Taxonomy of Cognitive Domain

Textbooks are considered as the common features of the classrooms and are important means to make contributions to curricula. Therefore, their contents are very essential to develop the adequate curriculum planning. A textbook analysis is a means by which different features of the textbooks can be analyzed and hence their effectiveness is validated. This study set out to evaluate the content of...


Metadiscourse Markers in the Discussion/Conclusion Section of Persian and English Master's Theses

Metadiscourse markers help writers make coherent and reader-friendly texts and are thus of considerable importance in academic writing. The main aim of this study was to investigate how interactive and interactional metadiscourse markers are used by Iranian EFL learners. An inquiry was carried out to investigate cross-cultural similarities and differences in the use of metadiscourse markers in the Di...


Statistical stylistics al-Hadid and at-taghabun based on Johnson

Linguists of the late twentieth century have paid particular attention to statistical stylistics. In this type of stylistics, texts are analyzed statistically, and the results are used to identify the distinctive features of a text, author or genre. Among the leading theorists in this field, Johnson is the theory of biological design vocabulary, style-statistical research ...



Publication date: 2011